style space
Exploring speech style spaces with language models: Emotional TTS without emotion labels
Chandra, Shreeram Suresh, Du, Zongyang, Sisman, Berrak
Many frameworks for emotional text-to-speech (E-TTS) rely on human-annotated emotion labels that are often inaccurate and difficult to obtain. Learning emotional prosody implicitly presents a tough challenge due to the subjective nature of emotions. In this study, we propose a novel approach that leverages text awareness to acquire emotional styles without the need for explicit emotion labels or text prompts. We present TEMOTTS, a two-stage framework for E-TTS that is trained without emotion labels and is capable of inference without auxiliary inputs. Our proposed method performs knowledge transfer between the linguistic space learned by BERT and the emotional style space constructed by global style tokens. Our experimental results demonstrate the effectiveness of our proposed framework, showcasing improvements in emotional accuracy and naturalness. This is one of the first studies to leverage the emotional correlation between spoken content and expressive delivery for emotional TTS.
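As a rough illustration of the text-aware style lookup described above, the sketch below queries a bank of learnable global style tokens with a BERT sentence embedding; the token count, dimensions, and attention form are illustrative assumptions, not the paper's exact configuration.

```python
import torch
import torch.nn as nn

class TextAwareStyleSpace(nn.Module):
    """Attend over a bank of global style tokens using a BERT sentence
    embedding as the query, yielding a style embedding without emotion
    labels. All sizes are illustrative, not TEMOTTS's actual settings."""

    def __init__(self, bert_dim=768, num_tokens=10, style_dim=256):
        super().__init__()
        self.style_tokens = nn.Parameter(torch.randn(num_tokens, style_dim))
        self.query_proj = nn.Linear(bert_dim, style_dim)

    def forward(self, bert_cls_embedding):           # (batch, bert_dim)
        query = self.query_proj(bert_cls_embedding)  # (batch, style_dim)
        # Scaled dot-product attention weights over the style tokens.
        scores = query @ self.style_tokens.t() / self.style_tokens.size(1) ** 0.5
        weights = torch.softmax(scores, dim=-1)      # (batch, num_tokens)
        return weights @ self.style_tokens           # (batch, style_dim)
```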
Enhancing Industrial Transfer Learning with Style Filter: Cost Reduction and Defect-Focus
Li, Chen, Ma, Ruijie, Qian, Xiang, Wang, Xiaohao, Li, Xinghui
Addressing the challenge of data scarcity in industrial domains, transfer learning emerges as a pivotal paradigm. This work introduces Style Filter, a methodology tailored to industrial contexts. By selectively filtering source-domain data before knowledge transfer, Style Filter reduces the quantity of data while maintaining or even enhancing the performance of the transfer learning strategy. It operates without labels, relies minimally on prior knowledge, is independent of specific models, and can be reused. Evaluated on authentic industrial datasets, Style Filter proves effective when applied before conventional deep-learning transfer strategies, underscoring its value in real-world industrial applications.
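The abstract does not spell out the filtering criterion, so the sketch below shows one label-free, model-agnostic reading: source samples are kept only if a simple style signature (channel-wise feature mean and standard deviation, a common style proxy) lies close to the target domain's centroid. The function names and keep ratio are hypothetical.

```python
import numpy as np

def style_signature(features):
    """Channel-wise mean and std of a (C, H, W) feature map: a common
    label-free 'style' proxy. Style Filter's actual criterion may differ."""
    c = features.shape[0]
    flat = features.reshape(c, -1)
    return np.concatenate([flat.mean(axis=1), flat.std(axis=1)])

def style_filter(source_feats, target_feats, keep_ratio=0.5):
    """Keep the fraction of source samples whose style signature is
    closest to the target-domain centroid; return their indices."""
    centroid = np.mean([style_signature(f) for f in target_feats], axis=0)
    dists = np.array([np.linalg.norm(style_signature(f) - centroid)
                      for f in source_feats])
    k = max(1, int(keep_ratio * len(source_feats)))
    return np.argsort(dists)[:k]  # indices of retained source samples
```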
Face Identity-Aware Disentanglement in StyleGAN
Suwała, Adrian, Wójcik, Bartosz, Proszewska, Magdalena, Tabor, Jacek, Spurek, Przemysław, Śmieja, Marek
Conditional GANs are frequently used for manipulating the attributes of face images, such as expression, hairstyle, pose, or age. Even though the state-of-the-art models successfully modify the requested attributes, they simultaneously modify other important characteristics of the image, such as a person's identity. In this paper, we focus on solving this problem by introducing PluGeN4Faces, a plugin to StyleGAN, which explicitly disentangles face attributes from a person's identity. Our key idea is to perform training on images retrieved from movie frames, where a given person appears in various poses and with different attributes. By applying a type of contrastive loss, we encourage the model to group images of the same person in similar regions of latent space. Our experiments demonstrate that the modifications of face attributes performed by PluGeN4Faces are significantly less invasive on the remaining characteristics of the image than in the existing state-of-the-art models.
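A minimal sketch of the kind of contrastive objective the abstract mentions, assuming latent codes and per-image identity labels; the pairwise form, distance, and margin are assumptions rather than PluGeN4Faces's exact loss.

```python
import torch
import torch.nn.functional as F

def identity_contrastive_loss(latents, person_ids, margin=1.0):
    """Pull latent codes of the same person together and push different
    identities beyond a margin. Assumes the batch contains at least one
    same-person pair; details are illustrative."""
    dists = torch.cdist(latents, latents)                        # (N, N)
    same = person_ids.unsqueeze(0) == person_ids.unsqueeze(1)    # (N, N)
    eye = torch.eye(len(latents), dtype=torch.bool, device=latents.device)
    pos = dists[same & ~eye]     # same identity, excluding self-pairs
    neg = dists[~same]           # different identities
    return pos.pow(2).mean() + F.relu(margin - neg).pow(2).mean()
```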
Make It So: Steering StyleGAN for Any Image Inversion and Editing
Bhattad, Anand, Shah, Viraj, Hoiem, Derek, Forsyth, D. A.
StyleGAN's disentangled style representation enables powerful image editing by manipulating the latent variables, but accurately mapping real-world images to their latent variables (GAN inversion) remains a challenge. Existing GAN inversion methods struggle to maintain editing directions and produce realistic results. To address these limitations, we propose Make It So, a novel GAN inversion method that operates in the $\mathcal{Z}$ (noise) space rather than the typical $\mathcal{W}$ (latent style) space. Make It So preserves editing capabilities even for out-of-domain images, a crucial property overlooked in prior methods. Our quantitative evaluations demonstrate that Make It So outperforms the state-of-the-art method PTI~\cite{roich2021pivotal} by a factor of five in inversion accuracy and achieves ten times better edit quality for complex indoor scenes.
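For orientation, optimization-based inversion in the $\mathcal{Z}$ (noise) space can be sketched as below; the generator interface (a `z_dim` attribute), loss, and optimizer settings are placeholders, not the paper's implementation.

```python
import torch

def invert_in_z(generator, target_image, steps=500, lr=0.01):
    """Optimize a noise-space latent z so that G(z) reconstructs the target.
    `generator` is any differentiable G: z -> image with a `z_dim` attribute
    (an assumed interface); the loss and step count are illustrative."""
    z = torch.randn(1, generator.z_dim, requires_grad=True)
    opt = torch.optim.Adam([z], lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        recon = generator(z)
        loss = torch.nn.functional.mse_loss(recon, target_image)
        loss.backward()
        opt.step()
    return z.detach()  # per the paper, Z-space codes keep edits usable
```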
Matched sample selection with GANs for mitigating attribute confounding
Singh, Chandan, Balakrishnan, Guha, Perona, Pietro
Measuring biases of vision systems with respect to protected attributes like gender and age is critical as these systems gain widespread use in society. However, significant correlations between attributes in benchmark datasets make it difficult to separate algorithmic bias from dataset bias. To mitigate such attribute confounding during bias analysis, we propose a matching approach that selects a subset of images from the full dataset with balanced attribute distributions across protected attributes. Our matching approach first projects real images onto a generative adversarial network (GAN)'s latent space in a manner that preserves semantic attributes. It then finds image matches in this latent space across a chosen protected attribute, yielding a dataset where semantic and perceptual attributes are balanced across the protected attribute. We validate projection and matching strategies with qualitative, quantitative, and human annotation experiments. We demonstrate our work in the context of gender bias in multiple open-source facial-recognition classifiers and find that bias persists after removing key confounders via matching. Code and documentation to reproduce the results here and apply the methods to new data are available at https://github.com/csinva/matching-with-gans.
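A minimal sketch of the matching step under stated assumptions: images have already been projected to GAN latent codes, and each latent from one protected-attribute group is greedily paired with its nearest unused neighbor in the other group. The paper's actual matching procedure may differ.

```python
import numpy as np

def match_across_attribute(latents_a, latents_b):
    """Greedily pair each latent in group A with its nearest unused latent
    in group B, yielding index pairs balanced across the protected
    attribute. Assumes len(latents_b) >= len(latents_a)."""
    used = set()
    pairs = []
    for i, za in enumerate(latents_a):
        dists = np.linalg.norm(latents_b - za, axis=1)
        j = next(k for k in np.argsort(dists) if k not in used)
        used.add(j)
        pairs.append((i, int(j)))
    return pairs  # matched images form the balanced analysis dataset
```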
Automated outfit generation with deep learning
We developed a machine learning model which is capable of completing an outfit based on a given seed product. Here we give an overview of our model and some of the challenges we faced. We consider an outfit to be a set of fashion items which match stylistically and can be worn together. In order for the outfit to work, each item must be compatible with all other items. Our aim is to create a model which embeds each item in a latent style space such that for any two items the dot product (a measure of similarity) of their embeddings reflects their compatibility.
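As noted above, compatibility is read off as a dot product in the style space; a minimal sketch of scoring and seed-based outfit completion under that assumption follows (names and dimensions are illustrative).

```python
import numpy as np

def compatibility(u, v):
    """Dot product of two style-space embeddings as a compatibility score."""
    return float(np.dot(u, v))

def complete_outfit(seed_emb, candidate_embs, k=4):
    """Rank candidates by compatibility with the seed item and return the
    top-k indices. A full system would also check that every chosen item
    is pairwise compatible with all the others, as the text requires."""
    scores = candidate_embs @ seed_emb
    return np.argsort(scores)[::-1][:k]
```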
DrumNet
Sony CSL Paris develops technology for AI-assisted music production. The goal is not to replace musicians, but to provide them with better tools to be more efficient in realizing their creative ideas. DrumNet is based on an artificial neural network that learns rhythmic relationships between different instruments and encodes these relationships in a 16-dimensional style space. A similar example is the Logic Pro X Drummer, which lets the user specify the playing style by navigating a two-dimensional space. DrumNet differs from the Logic Pro X Drummer, however, in that it dynamically adapts to the existing music.
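A hypothetical sketch, not Sony CSL's code, of what conditioning a drum-pattern generator on a point in a 16-dimensional style space might look like: moving the style vector changes the playing style while the output still depends on the input track.

```python
import torch
import torch.nn as nn

class StyleConditionedDrums(nn.Module):
    """Hypothetical sketch: map features of the existing track plus a
    16-dim style vector to a drum pattern, so the output both follows
    the style point and adapts to the music. Sizes are illustrative."""

    def __init__(self, track_dim=64, style_dim=16, steps=16):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(track_dim + style_dim, 128), nn.ReLU(),
            nn.Linear(128, steps))

    def forward(self, track_features, style):
        x = torch.cat([track_features, style], dim=-1)
        return torch.sigmoid(self.net(x))  # per-step hit probabilities
```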
Fashion Outfit Generation for E-commerce
Bettaney, Elaine M., Hardwick, Stephen R., Zisimopoulos, Odysseas, Chamberlain, Benjamin Paul
Combining items of clothing into an outfit is a major task in fashion retail. Recommending sets of items that are compatible with a particular seed item is useful for providing users with guidance and inspiration, but is currently a manual process that requires expert stylists and is therefore not scalable or easy to personalise. We use a multilayer neural network fed by visual and textual features to learn embeddings of items in a latent style space such that compatible items of different types are embedded close to one another. We train our model using the ASOS outfits dataset, which consists of a large number of outfits created by professional stylists and which we release to the research community. Our model shows strong performance in an offline outfit compatibility prediction task. We use our model to generate outfits and for the first time in this field perform an A/B test, comparing our generated outfits to those produced by a baseline model which matches appropriate product types but uses no information on style. Users approved of outfits generated by our model 21% and 34% more frequently than those generated by the baseline model for womenswear and menswear respectively.
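A minimal sketch of the embedding network described above, assuming pre-extracted visual and textual features per item; layer sizes and normalization are illustrative rather than the ASOS model's actual architecture.

```python
import torch
import torch.nn as nn

class OutfitEmbedder(nn.Module):
    """Fuse visual and textual item features into a latent style space
    where compatible items should land close together. Feature sizes
    are assumptions, not the paper's configuration."""

    def __init__(self, visual_dim=512, text_dim=300, style_dim=128):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(visual_dim + text_dim, 256), nn.ReLU(),
            nn.Linear(256, style_dim))

    def forward(self, visual, text):
        emb = self.mlp(torch.cat([visual, text], dim=-1))
        # Unit-normalize so proximity in the style space is comparable.
        return nn.functional.normalize(emb, dim=-1)
```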
Anime Style Space Exploration Using Metric Learning and Generative Adversarial Networks
Deep learning-based style transfer between images has recently become a popular area of research. A common way of encoding "style" is through a feature representation based on the Gram matrix of features extracted by some pre-trained neural network, or some other form of feature statistics. Such a definition rests on an arbitrary human choice and may not best capture what a style really is. In trying to gain a better understanding of "style", we propose a metric learning-based method to explicitly encode the style of an artwork. In particular, our definition of style captures the differences between artists, as shown by classification performance, and yields a style representation that can be interpreted, manipulated, and visualized through style-conditioned image generation with a Generative Adversarial Network. We employ this method to explore the style space of anime portrait illustrations.
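For contrast with the learned representation the paper proposes, the hand-chosen Gram-matrix style statistic it argues against can be computed in a few lines:

```python
import torch

def gram_matrix(features):
    """Gram matrix of a (C, H, W) feature map: channel-by-channel
    correlations, the classic fixed 'style' statistic. Normalization
    conventions vary; dividing by C*H*W is one common choice."""
    c, h, w = features.shape
    flat = features.reshape(c, h * w)
    return flat @ flat.t() / (c * h * w)
```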
Deep Style Match for Complementary Recommendation
Zhao, Kui (Zhejiang University) | Hu, Xia (Hangzhou Science & Technology Information Research Institute) | Bu, Jiajun (Zhejiang University) | Wang, Can (Zhejiang University)
Humans develop a common sense of style compatibility between items based on their attributes. We seek to automatically answer questions like "Does this shirt go well with that pair of jeans?" To answer these kinds of questions, we attempt to model the human sense of style compatibility in this paper. The basic assumption of our approach is that most of the important attributes of a product in an online store are included in its title description, so it is feasible to learn style compatibility from these descriptions. We design a Siamese Convolutional Neural Network architecture and feed it with title pairs of items that are either compatible or incompatible. These pairs are mapped from the original space of symbolic words into an embedded style space. Our approach takes only words as input, requires little preprocessing, and involves no laborious and expensive feature engineering.
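A minimal sketch of the Siamese setup under stated assumptions: one shared encoder maps each title (as word ids) into the embedded style space, and a pair of titles is scored by embedding similarity. Vocabulary size, convolution, and pooling details are illustrative, not the paper's exact architecture.

```python
import torch
import torch.nn as nn

class SiameseTitleEncoder(nn.Module):
    """Shared encoder mapping a product title (word ids) into a style
    space; a title pair is scored by the similarity of its embeddings.
    All sizes and layer choices are illustrative assumptions."""

    def __init__(self, vocab_size=20000, emb_dim=128, style_dim=64):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        self.conv = nn.Conv1d(emb_dim, style_dim, kernel_size=3, padding=1)

    def encode(self, word_ids):                       # (batch, seq_len)
        x = self.embed(word_ids).transpose(1, 2)      # (batch, emb_dim, seq_len)
        return torch.relu(self.conv(x)).max(dim=2).values  # pool over words

    def forward(self, title_a, title_b):
        ea, eb = self.encode(title_a), self.encode(title_b)
        return torch.cosine_similarity(ea, eb)        # compatibility score
```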